Fixes Issue #533 - Fix tuner dropping the first CLI argument. #536

Open
hliu-ai wants to merge 1 commit into vwxyzjn:master from hliu-ai:fix-tuner-argv-drop

hliu-ai commented Jan 12, 2026

Description

Fixes #533.

cleanrl_utils/tuner.py built sys.argv as a flags-only list, so the first tuned flag landed in sys.argv[0]. When runpy executes the target script, sys.argv[0] is set to the script path, overwriting whatever argument was placed there. As a result, the first tuned hyperparameter stayed at the script's default value in every trial instead of taking the value sampled for that trial.

This change preserves sys.argv[0] and appends the constructed flags:
sys.argv = [sys.argv[0]] + algo_command + [...]
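
The behavior described above can be reproduced outside the tuner. The following is a minimal sketch, not the code in cleanrl_utils/tuner.py: the toy script, the `run_with_argv` helper, the flag names, and the use of `runpy.run_path` with `run_name="__main__"` are all assumptions made for illustration.

```python
import os
import runpy
import sys
import tempfile
import textwrap

# Hypothetical stand-in for a target script; the flag names and defaults
# are illustrative and not copied from any CleanRL file.
TOY_SCRIPT = textwrap.dedent(
    """
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--learning-rate", type=float, default=2.5e-4)
    parser.add_argument("--gamma", type=float, default=0.99)
    args = parser.parse_args()
    """
)


def run_with_argv(argv):
    """Run the toy script via runpy with the given sys.argv; return its parsed args."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy_script.py")
        with open(path, "w") as f:
            f.write(TOY_SCRIPT)
        saved_argv = sys.argv
        sys.argv = list(argv)
        try:
            # run_path temporarily replaces sys.argv[0] with the script path
            # while the script runs, so whatever was in slot 0 is not parsed.
            script_globals = runpy.run_path(path, run_name="__main__")
        finally:
            sys.argv = saved_argv
    return script_globals["args"]


flags = ["--learning-rate=0.001", "--gamma=0.9"]

# Flags-only list: the first flag sits in sys.argv[0] and is overwritten.
before = run_with_argv(flags)
print(before.learning_rate, before.gamma)  # 0.00025 0.9

# Placeholder in slot 0, flags appended after it: both flags are parsed.
after = run_with_argv(["tuner.py"] + flags)
print(after.learning_rate, after.gamma)  # 0.001 0.9
```

In the flags-only run only gamma takes effect, which matches the symptom reported in the verification notes; with a placeholder in slot 0, both flags reach the parser.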

Verification:

  • Reproduced locally on Windows, Python 3.10.11. Before the change, the first tuned hyperparameter (learning rate) fell back to the script default while later flags (e.g., gamma) applied.
  • After the change, learning rate is passed correctly and varies per trial as expected.
  • Commit id: 004f8a0

Types of changes

  • Bug fix
  • New feature
  • New algorithm
  • Documentation

Checklist:

  • I've read the CONTRIBUTION guide (required).
  • I have ensured pre-commit run --all-files passes (required).
  • I have updated the tests accordingly (if applicable).
  • I have updated the documentation and previewed the changes via mkdocs serve.
    • I have explained note-worthy implementation details.
    • I have explained the logged metrics.
    • I have added links to the original paper and related papers.

If you need to run benchmark experiments for performance-impacting changes:

  • I have contacted @vwxyzjn to obtain access to the openrlbenchmark W&B team.
  • I have used the benchmark utility to submit the tracked experiments to the openrlbenchmark/cleanrl W&B project, optionally with --capture_video.
  • I have performed RLops with python -m openrlbenchmark.rlops.
    • For new feature or bug fix:
      • I have used the RLops utility to understand the performance impact of the changes and confirmed there is no regression.
    • For new algorithm:
      • I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
    • I have added the learning curves generated by the python -m openrlbenchmark.rlops utility to the documentation.
    • I have added links to the tracked experiments in W&B, generated by python -m openrlbenchmark.rlops ....your_args... --report, to the documentation.

vercel Bot commented Jan 12, 2026

@hliu-ai is attempting to deploy a commit to the Costa Huang's projects Team on Vercel.

A member of the Team first needs to authorize it.

hliu-ai (Author) commented Jan 12, 2026

This is my very first ever PR so hopefully everything was done right! I'll continue looking for issues that I am capable of working on and keep learning.


Development

Successfully merging this pull request may close these issues.

Bug in cleanrl_utils/tuner.py